Overview of the International Authorship Identification Competition at PAN-2011

Authors

  • Shlomo Argamon
  • Patrick Juola
Abstract

This paper gives an overview of the evaluation methodology applied to authorship identification solutions as part of PAN 2011. Two variations of authorship identification were explored: authorship attribution, determining which of a known set of authors wrote a text, and authorship verification, determining whether a specific author did or did not write a text. We summarize the varied methods used by the participants and present the overall results of the evaluation.

Similar resources

Authorship Identification with Modality Specific Meta Features - Notebook for PAN at CLEF 2011

This paper presents the approach used in the PAN ’11 authorship identification competition. Our method extracts meta features from several independently generated clustering solutions from the training set. Each clustering solution uses a disjoint set of features that represent a specific linguistic modality. The different clustering solutions encode similarities in writing styles of authors ac...

Authorship Identification in Large Email Collections: Experiments Using Features that Belong to Different Linguistic Levels - Notebook for PAN at CLEF 2011

The aim of this paper is to explore the usefulness of using features from different linguistic levels to email authorship identification. Using various email datasets provided by PAN’11 lab we tested several feature groups in both authorship attribution and authorship verification subtasks. The selected feature groups combined with Regularized Logistic Regression and One-Class SVM machine learni...

Authorship Verification Using the Impostors Method - Notebook for PAN at CLEF 2013

This paper describes the evaluation of the GenIM method, which participated in the PAN'13 authorship identification competition. The approach is based on comparing the similarity between the given documents and a number of external (impostor) documents, so that documents can be classified as having been written by the same author, if they are shown to be more similar to each other than to the ...

Authorship Identification of E-mail as a Multi-Class Task - Notebook for PAN at CLEF 2011

In this paper, we describe a multi-class text categorization approach to authorship attribution and test it on sets of e-mail collections. The PAN 2011 competition data consists of e-mails of variable length, written by various candidate authors, with some represented by significantly longer or more e-mails than others. Rather than construct a classifier for each separate author to discriminate...

EPSMS and the Document Occurrence Representation for Authorship Identification - Notebook for PAN at CLEF 2011

This paper describes the participation of the PISIS team in the authorship identification track of PAN’11. We adopted two different strategies for the tasks of authorship attribution and authorship verification. For authorship attribution we performed experiments with a document occurrence representation using a standard classification-based approach. Results obtained with this approach were mi...

Publication date: 2011